ABSTRACT


This paper explores the feasibility of fast transform
coefficients as classification features for pulse type
signals. The fast transforms investigated are Fourier (FFT),
Walsh (FWT), and Haar (FHT). A synthesized signal base
containing 79 distinct pulse shapes of similar duration
is analyzed for classification information compactness in
the discrete time, Fourier, Walsh, and Haar bases.
Nonparametric information measures are used. It is concluded
that a Fourier basis representation enables the significant
reduction of dimensionality necessary for further study as
a generator of classification features.
DEFINITION


A matrix having M rows and N columns.

The matrix transpose of A.

The signal space vector representation of the k-th signal
of the m-th class.

The elements of the signal space vector, which are samples
of the indicated signal taken at time instants nT,
n = 0, 1, ..., N-1, where T is the sample interval.

The approximator or estimate of the signal sample s(nT).

The transform space vector representation of the k-th
signal of the m-th class.

The n-th dimensional component or coefficient of the
transform space vector.

Dimensionality of the space concerned.

Cardinality of signal classes in the space.

Cardinality of signals in the m-th class.

Estimated mean vector or prototype.

The n-th dimensional component of the prototype.
The n-th dimensional component of the variance vector of
class m.

The relative current probability that an observed signal
should associate with class m.

Covariance matrix.

Correlation matrix.

The i-th eigenvalue of the real-symmetric matrix C.


TERM              DEFINITION

A/D               Analog to Digital (continuous to discrete)
                  conversion.

FFT               Fast Fourier Transform.

FHT               Fast Haar Transform.

FWT               Fast Walsh Transform.

Global            The whole space, meaning consideration of
                  all dimensions.

Class             The representations of signals from the
                  same source.

Cluster           The collection of points in N-space formed
                  by representations of signals of a common
                  class.

Category          The collection of all possible classes to
                  be considered.

Signal Space      The N-dimensional vector space equivalent
                  to the discrete time domain. Its basis
                  is the N-set of block pulses.

Transform Space   The N-dimensional vector space with an
                  orthonormal basis defined by the N
                  transform basis functions.

Feature Space     The R (< N) dimensional vector space
                  formed by discarding selected dimensions
                  of another space.

Coefficient       The projection onto a dimension of the
                  transform space.

Feature           A coefficient selected for use in the
                  classification process.

Metric, or        A function d(a,b) satisfying:
Measure           1. d(a,b) >= 0 with equality iff a = b,
                  2. d(a,b) = d(b,a),
                  3. d(a,b) + d(b,c) >= d(a,c).

Prototype         The best estimate of the true representation
                  for a class.
I. INTRODUCTION


A. BACKGROUND

Radio-fingerprinting, or signal source identification, has
been regarded with varying degrees of skepticism over the
years. Early attempts at radar fingerprinting were based
on at most three parameters: signal carrier frequency (RF),
pulse repetition frequency (PRF) or interval (PRI), and
pulse width (PW). The receivers used for parameter
measurements and operator skill differences produced errors
great enough to mask subtle differences between individual
radars of a type, and often veiled even type identification.
The process was of course largely manual, and speed was a
function of operator skill and knowledge. Finally, since
the data base was compiled mostly from the above
observations, the parameter value estimators were not always
reliable.

Studies by Stanford Research Institute (SRI), among others,
in the early 1960's were influenced by the requirement for
greater speed and accuracy and stimulated by advances in
computer technology. Advancements in radar such as frequency
agility and intra-pulse modulation dictated that measurements
of the emitter scan characteristics and modulation type be
added to the traditional parameters RF, PRF, and PW. The
emphasis, however, remained on type classification.

Signal fingerprinting with precision measurement of
traditional parameters as the basis, as well as some
investigation into classification by pulse shape, began in
the late 1960's. Bennett [1], [2] has explored a number of
linear and nonlinear representations of pulse type signals
on the basic investigative level. More recent work, as
yet unpublished, addresses this problem in an applied
manner using linear bases, as does the research reported
on here.


B. SCOPE OF THE THESIS 

The introduction of various fast discrete transform
algorithms and the versatile minicomputer has opened new
areas in the realm of signal processing and pattern
recognition. Although much work has been done on the
application of fast transforms, most of it has been in the
areas of image processing and two-dimensional character
recognition [3], [4]. However, Bennett [1] included two of
the three linear bases considered in this research (Fourier
and Walsh), and their fast algorithms, in his work on pulse
representation comparison.

The intention of the work reported here is to investigate
four orthonormal bases with respect to their suitability
in signal source classification using standard pattern
recognition techniques. The discrete transforms selected are
three of the class possessing fast algorithms, namely, the
fast Fourier transform (FFT), fast Walsh transform (FWT), and
fast Haar transform (FHT). The fourth basis is the signal
(block pulse) basis itself.




1. Signal Synthesis and Data Base Collection 


In order to reduce the number of variables affecting
a signal from a set of sources, a data set is synthesized
rather than received from actual radars. The result is
a set of 79 radar-like pulse trains of high stability and
repeatability. Conversion from continuous to discrete form
was performed in the laboratory under controlled conditions
so that the only noise present in the data base is due to
quantization error.

The pulse synthesizer is modeled after the switched, 
open-line type of pulse forming network found in some early 
radars. An artificial (lumped element LC) transmission line 
tapped at each of its 13 section junctions is alternately 
charged and short circuited by a pulser circuit triggered 
by a conventional laboratory pulse generator. Fig. 1 con- 
tains a schematic diagram of the pulser. By jumpering the 
section taps two at a time the 79 pulse types were generated. 

A large number of pulses of each type (or class)
were converted from continuous form to discrete sample values.
A wide bandwidth (10 MHz) analog-to-digital (A/D) converter
reduced each pulse to a 128 sample, 8 bits per sample
representation. The digitized pulse data was then recorded on
magnetic tape as the permanent record.

2. Experimental Equipment 

All analysis work was performed on the prototype
AN/UYQ-9(XN-1), or Parameter Encoder, a general purpose signal
analysis computer system composed of a teleprinter, card
reader, graphics terminal, magnetic tape drive, and 1000K
word disc file, as well as special purpose A/D and signal
processing devices, interfacing with a minicomputer.

Software for this work was written specially for the
purpose in Basic Fortran. Included are programs and
subroutines to convert (transform) the data on magnetic
tape to four 64-dimensional representations, namely, the
signal or block pulse basis and the Fourier, Walsh, and Haar
function bases in their discrete forms, and analysis programs
which measure the classificatory value of these
representations.

Results of the analysis are presented graphically 
on the terminal cathode ray tube where they are either photo- 
graphed or processed by a special hard copy unit, or printed 
by teletypewriter. 

3. Theoretic Basis for the Thesis 

The question to which an answer is sought in this
research is twofold: are any of the rotations defined by the
fast Fourier, Walsh, and Haar transforms useful in a
dimensionality reduction sense for the signal data set as
prescribed, and is further investigation on more general
signal sets and possible applications warranted?

The choice of bases is a good mix of properties.
Both the Fourier and Walsh bases are global in nature, that
is, each transform coefficient is a function of all samples
in the signal. The Walsh and Haar bases are closely
related in general shape and by their generating process,
while the Haar and signal or block pulse bases have the
common property of being local in nature, that is, their
functions are nonzero only on a portion of the signal (time)
axis.

If it is found that a certain transformation is 
able to represent the distinctive features of the entire 
category of signals with relatively few coefficients, then 
that transform will probably lend itself to an efficient 
classification process. 

To this end the signal data is projected onto the
three orthonormal bases and analyzed for classificatory
information content and distribution. The methods and
measures used are discussed in the context of this research
and application.

Although this work stops at feature selection, the 
only completely valid test for comparison of one set of 
classification features with another is performance under 
a specified classification rule. Specifying the rule which 
best fits the problem at hand is itself sufficient to be 
the topic of a thesis, and is not considered here. The 
literature contains both general and specialized studies on 
the subject of classification. 

Two feature selection metrics based on second order
statistics are developed and applied to the data. Measures
of a potentially more powerful nature, such as those founded
in information theory, are not considered because of the
requirement for knowledge of the distributions involved.

A brief discussion of rank order as a feature set is
included and could also be the subject of additional work.
The use of a ranked vector of feature indices may remove
some of the detrimental effects of time reference shift on
Walsh and Haar transforms as well as reduce the information
to be processed by quantizing the feature space.

4. Results and Conclusions 

The signal data base was discovered to contain
significant jitter or variation in the time position of the
sampling window. The extent and effect of this jitter was
not discovered until analysis of the results had begun. Due
to time limitations, no attempt was made to reconstruct and
reprocess the data.

It is concluded that, in the presence of time jitter,
the Fourier basis can result in significant dimensionality
reduction, and that the Walsh and Haar bases offer little
if any improvement over the signal samples themselves.
However, in the absence of jitter an improvement in
performance of the Walsh and Haar bases is expected.
Considering the speed advantage of the FHT over the FWT and
the FWT over the FFT, some compromise in optimality might be
indicated for real time processing.
II. EXPERIMENTAL PROCEDURE


A. DATA BASE

Early attempts at the construction of a satisfactory
data base were oriented toward reception and digitizing
of local radars. This approach, though esthetically
satisfying, proved impractical for this work. The data base
had to meet several criteria which ruled out the use of "live"
signals. First, the objective is to determine the feasibility
of fast transform methods as generators of high quality
classification features, and not to evaluate or specify a
complete intercept system for the task. Secondly, there is
the "completeness" problem, which is the requirement for the
data base to span the range of pulse shapes expected to be
encountered. The limited number of radars in the local area
limits the completeness of the data. The alternative
selected is to employ a pulse synthesizer consisting of a
triggered silicon controlled rectifier (SCR) switch driving
an open type artificial transmission line pulse forming
network similar to those used in early radars [5]. Modern
pulse formers are more likely to employ saturable inductances
in a high level modulator, but will still produce component
value dependent pulse shapes characteristic of that radar.

Figure 1 contains a schematic diagram of the line pulser
and its connection to the line and other devices, and also
shows the ensemble of pulse shapes which comprise the data
base. The artificial line, originally constructed for
laboratory experimental use, is tapped at each end and at
each section junction for a total of 13 taps. The line
characteristics and hence the pulse shape are altered by
jumpering or short circuiting various taps, creating branches
and shortening the length of the main line as shown in
Figure 2.
Figure 2. Line Configuration for Class 2-7


In forming the pulse ensemble, 13 taps, numbered 1 to 13, were
exhaustively jumpered two at a time for $\binom{13}{2} = 78$
distinct line perturbations, which with the "no jumper" or 1-1
configuration yielded 79 distinct pulse shapes. Referring to
Figure 1, note that adjacent pulses are similar both row
and column wise, differing mainly in the position and shape
of the perturbation. Although they may not accurately
represent any given radar's emitted pulse shape, they do
span a considerable number of possible shapes for pulses
of similar duration, and are considered a good base for
comparison purposes.
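
The class count can be checked by enumeration. The following is a minimal sketch, not part of the original software; the function name `pulse_classes` and the tuple labels are illustrative assumptions.

```python
from itertools import combinations

TAPS = range(1, 14)  # taps numbered 1 to 13, as in the text


def pulse_classes():
    """Return the class labels: the unjumpered line plus every tap pair."""
    classes = [(1, 1)]                     # "no jumper" configuration, labelled 1-1
    classes.extend(combinations(TAPS, 2))  # all two-at-a-time jumperings
    return classes


classes = pulse_classes()
print(len(classes))             # 79
print(classes[1], classes[-1])  # (1, 2) (12, 13)
```

The enumeration order (1-2, 1-3, ..., 12-13) is simply the order produced by `combinations`.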

The electrical length of the line is 9 microseconds,
giving a maximum pulse length of 18 microseconds. Observation
of all pulse spectra showed that spectral components
above 900 kHz are at least 50 dB below the largest component.
Based on this and a requirement for complete framing of the
pulse in a 64-sample window, a sample rate of 3.0 MHz was
chosen for digitizing.
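
The arithmetic behind this choice can be written out as a short check. This is only an illustrative sketch using the figures quoted above; the variable names are assumptions.

```python
SAMPLE_RATE_HZ = 3.0e6      # sample rate chosen in the text
WINDOW_SAMPLES = 64         # framing window length
MAX_PULSE_S = 18e-6         # maximum pulse length
SPECTRAL_LIMIT_HZ = 900e3   # components above this are at least 50 dB down

window_s = WINDOW_SAMPLES / SAMPLE_RATE_HZ  # time span covered by the window
nyquist_hz = SAMPLE_RATE_HZ / 2             # highest unaliased frequency

print(f"window = {window_s * 1e6:.2f} us")      # window = 21.33 us
print(f"nyquist = {nyquist_hz / 1e3:.0f} kHz")  # nyquist = 1500 kHz
print(window_s >= MAX_PULSE_S)                  # True: the pulse fits the window
print(nyquist_hz > SPECTRAL_LIMIT_HZ)           # True: significant spectrum is below Nyquist
```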

Digitizing is the process of converting the continuous
voltage waveform output of the line pulser to equally spaced
voltage samples, expressed as 8-bit (256 level quantization)
binary numbers or words, and recording these samples
on magnetic tape. The maximum rate of the A/D converter is
10 MHz, placing the rate used well within device limitations.
Each pulse digitization consists of 128 samples. A total
of 4096 pulses for each of two "identical" lines in each of
the 79 configurations was committed to magnetic tape, and
these make up the permanent data base.

The final step in conditioning the data for analysis is
framing of the pulses in windows of 64 samples each. The
first sample or the beginning of the window should correspond
to a constant amplitude point on the leading edge of the
pulse, simulating a threshold crossing triggered sampler.
This step is performed manually by placing the joy-stick
controlled cursor of the graphics terminal on the desired
point of the leading edge of a displayed pulse and commanding
a "store" operation. The algorithm finds and stores the 64
samples following the cursor position in a file on the
system's disc. Figure 3 illustrates the flow and form of the
data during the data base building process.
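
For comparison, an automatic version of the framing step might look like the sketch below. It is an assumption for illustration only: the thesis performed this step manually with the cursor, and the function name `frame_pulse` and the threshold value are hypothetical.

```python
def frame_pulse(samples, threshold, window=64):
    """Return `window` samples starting at the first sample >= threshold.

    Returns None if the threshold is never reached or if fewer than
    `window` samples remain after the crossing.
    """
    for start, value in enumerate(samples):
        if value >= threshold:
            frame = samples[start:start + window]
            return frame if len(frame) == window else None
    return None


# A synthetic 128-sample record: baseline, a flat 40-sample pulse, baseline.
record = [0] * 10 + [100] * 40 + [0] * 78
frame = frame_pulse(record, threshold=50)
print(len(frame), frame[0], frame[39], frame[40])  # 64 100 100 0
```

A fixed threshold removes the operator from the loop, but it does not by itself remove jitter caused by sampling that is unsynchronized with the pulse.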

It is this last step, with its human interaction, together
with the unsynchronized nature of the digitizing process,
that causes the window time jitter and the consequently
poor data from the non-time-invariant FWT and FHT.


Figure 3. Data Flow and Form During Data Base
Construction


B. DATA PROCESSING 

The sequence of operations on the signal data base is
experimental in nature and is not designed for production
type processing, although some of the subroutines could be
readily adapted for use in an operational program.

Programs written for the Parameter Encoder for this
research are listed below with a description of their
functions.

1. Program TRED

a. Function
TRED prescreens and transfers the digitized data
from magnetic tape to disc file.
b. Description
TRED reads the 128 samples per pulse data from
a specified file and record on the magnetic tape, and
displays sequentially and individually in graph form on the
terminal CRT those pulses which exceed a preset threshold
value. If the signal is noisy, that is, contains parity
errors or is not of the desired type due to an error in
jumpering the pulsed line, it can be rejected. If the data
is suitable for further processing, the operator places the
cursor crosshairs at the leading edge of the pulse. The 64
samples following the leading edge are stored on the disc
as 64 16-bit integers. The program may be terminated at any
time.

2. Program SIFT

a. Function
SIFT calls those signals stored by TRED and
performs 1, 2, or 3 fast transformations on the data. The
transform coefficients are then stored along with the signal
data in a separate class structure file on the disc.
b. Description
SIFT can optionally perform a second screening
of the data, enabling cursor positioning errors to be detected,
or it may automatically and sequentially process any signal
data located in TRED's file. The three transformations which
can be performed are subroutines and are easily changed.
All transform coefficients are normalized to the average
value (zero-th order coefficient) and then stored as 64
floating point numbers in a disc file location set
aside for that particular pulse type or class. Listings of
the fast transform subroutines are provided in Appendix C.


3. Program MEVAR 


a. Function 
MEVAR calculates the class mean and variance.
b. Description 
MEVAR uses the transform data stored by program 
SIFT to calculate the mean values of each coefficient of each 
transform for the signal type or class specified. The means 
are stored and then used to calculate the variance or second 
central moment of each coefficient and transform. These 
class data are stored in a third disc file. 
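
The MEVAR computation can be sketched as follows. This is not the original Basic Fortran; the function name is illustrative, and dividing by the class size K (rather than K-1) is an assumption based on the phrase "second central moment".

```python
def class_mean_variance(vectors):
    """Per-coefficient mean and variance of one class.

    `vectors` is a list of equal-length coefficient vectors, one per signal.
    The variance is the second central moment (divisor K, an assumption).
    """
    k = len(vectors)
    dims = len(vectors[0])
    mean = [sum(v[n] for v in vectors) / k for n in range(dims)]
    var = [sum((v[n] - mean[n]) ** 2 for v in vectors) / k for n in range(dims)]
    return mean, var


mean, var = class_mean_variance([[1.0, 2.0], [3.0, 6.0]])
print(mean)  # [2.0, 4.0]
print(var)   # [1.0, 4.0]
```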
4. Program GVAR
a. Function
GVAR is a feature selector program calculating a
measure of feature goodness based on the average fluctuation
of class mean values weighted inversely to class variation.
b. Description
GVAR uses the class data of program MEVAR to
calculate a global central second moment from weighted class
data. The results are presented in original coefficient order
and also in rank order.
5. Program FRAT 
a. Function 
FRAT is a feature selector program similar to 
GVAR. The measure of goodness it employs includes a weight 
which is a function of the number of members in each class. 
The results can be interpreted as a kind of signal to noise 
ratio where the signal is classificatory information and 
noise is the average within class variance of the signal 
transform coefficients for each dimension. 
b. Description 
FRAT uses the class average data of program
MEVAR, as does GVAR. The results are presented in original
coefficient order and in rank order.
III. SIGNAL TO TRANSFORM SPACE - PROJECTION


A. TRANSFORMATIONS AND CLASSIFICATION

At this point the terminology and notation employed for
the remainder of the thesis will be standardized and oriented
toward linear vector spaces and the classification problem
rather than to other concepts.

The question "why transform?" may be asked with some
validity. Any operation on a signal requires time and
expense. The answer, fundamental to the field of pattern
recognition, is reduction of dimensionality. A complete
description of any possible signal representable in a space
of dimension N requires all N dimensions. For signals
emitted by a specific source, the N-dimensional
representations will be similar and will differ in some manner
from representations of signals from another source. The
problem of classification is how to measure this difference so
that classification errors are somehow minimized. If all N
dimensional projections contain significant information then
all N must be included in the metric. However, if this signal
space can be rotated somehow so that the information in
some of its dimensions can be projected onto a single
dimension in another space, then the information has been
compressed or dimensionality reduced. However, a rotation that
works for one signal class probably won't work for all signal
classes in the category of signals of interest. The criteria
and evaluation methods for a transformation are discussed
in Section IV.


B. THE FAST TRANSFORMS 

The primary reason for selection of the Fourier, Walsh,
and Haar discrete transformations is the existence of fast
algorithms based on elimination of redundancy [3], [4], by
matrix factorization of the basis matrix. An N-dimensional
transformation in general requires $N^2$ real or complex
multiplications. An FFT or FWT requires but $N \log_2 N$
arithmetic operations (complex multiplications for the FFT and
real additions for the FWT). An FHT, because of its highly
local nature (lots of zeros in the transform matrix), requires
only 2(N-1) real additions and N-2 normalizing multiplications.
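
The two addition-only transforms can be sketched in a few lines. These are illustrative sketches, not the Appendix C listings: both are left unnormalized, and the Walsh transform is given in Hadamard (natural) ordering rather than sequency ordering, which is an assumption made for brevity.

```python
def fwht(x):
    """Fast Walsh-Hadamard transform, Hadamard ordering, unnormalized.

    Uses log2(N) butterfly stages of N additions/subtractions each.
    """
    a = list(x)
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a


def fht(x):
    """Fast Haar transform, unnormalized.

    Each stage halves the working length, so the total is
    (N-1) sums plus (N-1) differences = 2(N-1) additions.
    """
    a = list(x)
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    out = [0] * n
    length = n
    while length > 1:
        half = length // 2
        sums = [a[2 * i] + a[2 * i + 1] for i in range(half)]
        diffs = [a[2 * i] - a[2 * i + 1] for i in range(half)]
        out[half:length] = diffs   # detail coefficients at this scale
        a[:half] = sums            # running sums feed the next stage
        length = half
    out[0] = a[0]                  # overall sum (zero-th coefficient)
    return out


print(fwht([1, 2, 3, 4]))  # [10, -2, -4, 0]
print(fht([1, 2, 3, 4]))   # [10, -4, -1, -1]
```

Applying `fwht` twice returns the input scaled by N, which is a convenient correctness check.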

Another important reason for the selection of discrete 
Fourier, Walsh and Haar transforms is the difference in the 
basis functions of the transformations. Appendix B addresses 
the Walsh/Hadamard and Haar functions in greater detail. 

The Fourier and Walsh functions possess similarities such 
as the average number of sign changes per unit interval and 
even/odd symmetry which lead to the terms sequency, sal, 
and cal for the Walsh functions. Furthermore, the Walsh 
and Haar functions are closely related. 

A final comment on translational invariance is in
order. The Fourier basis representation is invariant under
time translation while the Walsh basis is invariant under
dyadic translation. That is, the Fourier magnitude
coefficients do not change when the signal data samples are
cyclically translated, that is, when

\[ S = (s_0, s_1, \ldots, s_{N-1}) \]

is replaced by

\[ S_k = (s_{0 \oplus k}, s_{1 \oplus k}, \ldots, s_{(N-1) \oplus k}) \]

where $\oplus$ indicates modulo(N) addition. This is equivalent
to sliding the signal in the reference frame. Walsh
coefficients do change under this type of translation but are
invariant when signal data are translated or reordered
according to the mod(2) bit-by-bit sum of the original index
and the translation constant, k:

\[ S = (s_{0\ldots000}, s_{0\ldots001}, \ldots, s_{1\ldots111}) \]

\[ S_k = (s_{(0\ldots000) \oplus k}, s_{(0\ldots001) \oplus k}, \ldots, s_{(1\ldots111) \oplus k}) \]

where $\oplus$ now indicates modulo(2) bit-by-bit addition and k
is an integer expressed in binary form. For this application,
dyadic invariance is not beneficial, but if time translation
is minimized this drawback is not serious. The Haar transform
is also not time invariant.
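
The two kinds of invariance can be demonstrated numerically. The sketch below is illustrative only: it uses a direct (slow) DFT and an unnormalized Walsh-Hadamard transform in Hadamard ordering, compares coefficient magnitudes (a dyadic shift can change the sign of a Walsh coefficient), and uses an arbitrary test signal.

```python
import cmath


def dft_magnitudes(x):
    """Magnitudes of the discrete Fourier transform (direct O(N^2) form)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n)
                    for t in range(n)))
            for f in range(n)]


def wht_magnitudes(x):
    """Magnitudes of the unnormalized Walsh-Hadamard transform."""
    a = list(x)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return [abs(v) for v in a]


def cyclic_shift(x, k):
    """Reorder by modulo-N addition of k to each index."""
    n = len(x)
    return [x[(i + k) % n] for i in range(n)]


def dyadic_shift(x, k):
    """Reorder by bit-by-bit modulo-2 addition (XOR) of k to each index."""
    return [x[i ^ k] for i in range(len(x))]


def close(a, b, tol=1e-9):
    return all(abs(p - q) <= tol for p, q in zip(a, b))


signal = [0, 1, 4, 9, 7, 3, 1, 0]
print(close(dft_magnitudes(signal), dft_magnitudes(cyclic_shift(signal, 3))))  # True
print(close(wht_magnitudes(signal), wht_magnitudes(dyadic_shift(signal, 3))))  # True
print(close(wht_magnitudes(signal), wht_magnitudes(cyclic_shift(signal, 3))))  # False
```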

Figures 4, 5, 6, and 7 are plots of the 64 dimensional
representations of the data for the 79 pulse classes 1-2,
1-3, 1-4, ..., 11-12, 11-13, 12-13. Referring to Figure 1,
the class sequence progresses up the columns moving from
left to right. The spaces are signal, Fourier (magnitude),
Walsh, and Haar in Figs. 4, 5, 6, and 7 respectively.
IV. DIMENSIONALITY REDUCTION - FEATURE SELECTION


A. PURPOSE

A main principle in pattern recognition is the elimination
of redundancy and useless information in the given data
so that the classifying algorithm can make efficient use of
both time and machines. This elimination process is
dimensionality reduction, and the process itself is commonly
termed feature selection [3], [4], [7]-[9].


B. FEATURE SELECTION 

The projection of an N-dimensional signal vector repre- 
senting an N-sampled time function from the signal space 
to a transform space by means of a complete orthonormal 
transformation does not in any way inherently reduce the 
dimensionality of the representation. However, a transfor- 
mation of this type can be viewed as measuring the correla- 
tion between the signal and each of the N basis functions. 
Hence it seems reasonable to assume that, given a certain 
category of signals, certain orthonormal transformations are 
more efficient than others in the sense of requiring fewer 
coefficients to attain whatever the objective may be. 

If the objective happens to be representation of the
signal in more compact form, then perhaps all transform
coefficients smaller than some threshold value could be
eliminated, resulting in a reduction from N to, say, K
dimensions. Then the representation obtained from the
inverse transformation back into the signal space is the
best K-dimensional approximator of the original signal in
terms of that orthonormal basis. See Appendix A. The
"closeness" of this approximate representation is commonly
measured in terms of mean square (or energy) error (MSE),
which in vector space context is the squared Euclidean
distance. This error is given by

\[ \mathrm{MSE} = \sum_{n=0}^{N-1} \left( s(nT) - \hat{s}(nT) \right)^2 \]

where: $s(nT)$ are the original signal samples,
$\hat{s}(nT)$ are the signal approximator "samples", and
T is the sample interval.
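
A small numerical sketch of this truncation follows. It is an illustration under stated assumptions: the orthonormal Walsh-Hadamard basis is chosen only because it is its own inverse, keeping the K largest coefficients stands in for thresholding, and the test signal is arbitrary. For any orthonormal basis the squared error equals the energy of the discarded coefficients.

```python
import math


def wht_orthonormal(x):
    """Orthonormal Walsh-Hadamard transform (its own inverse)."""
    a = list(x)
    n = len(a)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in a]


def keep_largest(coeffs, k):
    """Zero all but the k coefficients of largest magnitude."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def squared_error(s, s_hat):
    """Squared Euclidean distance between signal and approximator."""
    return sum((p - q) ** 2 for p, q in zip(s, s_hat))


signal = [0.0, 1.0, 4.0, 9.0, 7.0, 3.0, 1.0, 0.0]
coeffs = wht_orthonormal(signal)
kept = keep_largest(coeffs, 3)
approx = wht_orthonormal(kept)  # inverse transform back to signal space

mse = squared_error(signal, approx)
discarded_energy = sum(c * c for c in coeffs) - sum(c * c for c in kept)
print(abs(mse - discarded_energy) < 1e-9)  # True
```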


For the purpose of classification of signals the
elimination criteria are different, and the MSE of the before
and after representations is not necessarily a good measure.
Some of the most distinctive characteristics of a signal may
contain very little energy. Their elimination causes little
energy error but a large loss of classificatory information
in the reduced representation.

Consider the signals $s_i(nT)$, n = 0, 1, 2, ..., N-1,
i = 1, 2, ..., I, originating from M << I source classes, all
from the category of interest (pulsed signals from different
sources of the same type). The N-dimensional vectors formed
by the signal samples define I points in the transform
space which will tend in some manner to form M clusters
representing the M source classes. For a given orthonormal
transformation the I points will project onto each of the
N basis vectors and clustering to some extent will occur in
each dimension. The dimensional cluster for, say, class $m_1$
will exhibit some spreading which is related to the manner
in which signal perturbations and system noise project onto
that particular basis vector. Another class, $m_2$, will
similarly cluster on that dimension with some spreading.
The difference between the cluster mean values is a measure
of that dimension's classificatory information, the use of
which in classification is degraded by the intra-class
spreading.





Figure 8. Hypothetical 2-Class Projection 
onto 3 Orthogonal Axes. 


Figure 8 is a 3-dimensional, 2-class hypothetical example.
The projections of both classes $m_1$ and $m_2$ on basis axes
$t_1$ and $t_3$ exhibit small spreading; however, the cluster
separability on axis $t_1$ is clearly greater than on axis
$t_3$. The projections onto axis $t_2$ are widely spread and
even though the mean values differ considerably, no
separability exists.

This illustration suggests a class of feature selection 
metrics based on the concept of signal (information) to 
noise ratio. The two feature selection metrics investigated 
in this thesis are both of this type. The first is simple 
and intuitive, used primarily for purposes of illustration. 
This metric is incorporated in Program GVAR and can be 


expressed as 


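\[ G_n = \frac{1}{M-1} \sum_{m=1}^{M} \frac{\left( \hat{\mu}_n^{(m)} - \hat{\mu}_n \right)^2}{\hat{\sigma}_n^{2(m)}} \]

where:

$\hat{\mu}_n^{(m)}$ is the estimated n-th dimensional
mean for class m,

$\hat{\mu}_n$ is the estimated n-th dimensional
average of class mean estimates,

$\hat{\sigma}_n^{2(m)}$ is the estimated n-th dimensional
variance for class m, and

M is the number of classes in the category.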


There is no compensation for differences in cardinality of
class populations, and it is sensitive to round-off errors
encountered when the dimensional projection means and the
variances are nearly equal and very small, as would occur
if the basis vector for that dimension were orthogonal to
everything in the signal vector. This instance occurred
in the case of the Haar basis, some functions of which are
non-zero only in regions where the signal is either zero
or a constant.

This test is simply the average of the ratios of squared 
class mean deviations from the global average to class 
variance. Because of the sensitivity to computational 
errors, the results may be misleading. It does lead natu- 
rally to a more powerful and less sensitive variance ratio 
test incorporated in Program FRAT. 
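
The G test as described above can be sketched as follows. This is an illustrative reading of the formula, not the GVAR program itself; the function name and the two-class, two-dimension example data are invented for the illustration.

```python
def g_metric(class_means, class_vars):
    """G value for each dimension n.

    class_means[m][n] and class_vars[m][n] hold the estimated mean and
    variance of dimension n for class m. Each term divides by a class
    variance, so a near-zero variance inflates the result.
    """
    m_count = len(class_means)
    dims = len(class_means[0])
    g = []
    for n in range(dims):
        grand = sum(mu[n] for mu in class_means) / m_count  # average of class means
        total = sum((class_means[m][n] - grand) ** 2 / class_vars[m][n]
                    for m in range(m_count))
        g.append(total / (m_count - 1))
    return g


means = [[0.0, 1.0], [2.0, 1.2]]       # hypothetical class means
variances = [[0.5, 0.5], [0.5, 0.5]]   # hypothetical class variances
g = g_metric(means, variances)
ranked = sorted(range(len(g)), key=lambda n: g[n], reverse=True)
print(g[0])    # 4.0
print(ranked)  # [0, 1]: dimension 0 separates the classes better
```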

This latter test is a modified form of the Snedecor
F test, so named for Fisher, on whose Z distribution the test
is based [10]. Snedecor's F test as used here provides,
in addition to a relative goodness number, a confidence
percentage that the variance among class mean values is not
due to the average intra-class variance (or noise). However,
it is modified slightly to reflect the relative
probabilities of occurrence of a class. The metric F is
given by

\[ F_n = \frac{\dfrac{1}{M-1} \sum_{m=1}^{M} p_m \left( \hat{\mu}_n^{(m)} - \hat{\mu}_n \right)^2}{\dfrac{1}{M} \sum_{m=1}^{M} p_m \hat{\sigma}_n^{2(m)}} = \frac{\dfrac{1}{M-1} \sum_{m=1}^{M} \dfrac{K_m}{\sum_{j=1}^{M} K_j} \left( \hat{\mu}_n^{(m)} - \hat{\mu}_n \right)^2}{\dfrac{1}{M} \sum_{m=1}^{M} \dfrac{K_m}{\sum_{j=1}^{M} K_j} \hat{\sigma}_n^{2(m)}} \]

where:

$p_m$ is the relative probability of occurrence
of class m (more properly, that an observed
signal came from source class m), and

$K_m$ is the number of signals in class m.

A comparison of the results of the G test and the F
ratio test indicates that the latter is not as sensitive to
data and computational problems.
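
A sketch of the F ratio in the same style follows. It implements the probability-weighted form as read from the text, so it should be taken as an assumption about the FRAT computation rather than a transcription of it; the function name and example data are invented.

```python
def f_ratio(class_means, class_vars, probs):
    """Variance ratio for each dimension n.

    Numerator: probability-weighted spread of class means about their
    average, divided by M-1. Denominator: probability-weighted class
    variance, divided by M. probs[m] is the relative probability of class m.
    """
    m_count = len(class_means)
    dims = len(class_means[0])
    f = []
    for n in range(dims):
        grand = sum(mu[n] for mu in class_means) / m_count
        between = sum(probs[m] * (class_means[m][n] - grand) ** 2
                      for m in range(m_count)) / (m_count - 1)
        within = sum(probs[m] * class_vars[m][n]
                     for m in range(m_count)) / m_count
        f.append(between / within)
    return f


means = [[0.0, 1.0], [2.0, 1.2]]       # hypothetical class means
variances = [[0.5, 0.5], [0.5, 0.5]]   # hypothetical class variances
class_sizes = [100, 100]               # hypothetical K_m
probs = [k / sum(class_sizes) for k in class_sizes]
f = f_ratio(means, variances, probs)
print(f[0])  # 4.0
```

Because the class variances are pooled in the denominator before dividing, a single class with a near-zero variance no longer dominates the result, which is consistent with the reduced sensitivity noted above.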

Signal-to-noise or variance ratio type tests are not 
the only metrics for feature selection. Several information 
theoretic approaches have been applied to multiclass classi- 
fication [7], [8]. There are other, perhaps more elegant 
methods, applicable to the two class problem or the clustering 
problem [9]. 

From the feature selector algorithm results a subset
of coefficients is chosen which can be tested further for
optimality. Of course the only valid test is minimization
of classification error, a test not performed here because
of time limitations.


C. COVARIANCE AND CORRELATION 
While feature selection tests will in general measure
the classificatory information a feature (or dimensional
projection) contains, they are not sensitive to the kind of
information but rather only to the average net accumulation.
If the pulse signal classes of concern have hypothetical
linearly independent details, say A, B, C, and D, which occur
in various linear combinations to characterize the classes,
then the optimal linear orthogonal transformation which can
be performed on the signal data is the one which is able to
project each detail onto its own dimension. Restated, let
the basis vectors of the transformation be generated from
the signal details so that the projections in the transform
space are mutually uncorrelated. Sebestyen [9] proves that
this transformation (followed by a diagonal feature weighting
transformation) is the optimum linear transformation for
feature generation. This transformation is variously called
Hotelling's Method of Principal Components, the Karhunen-Loeve
Transform, and factor analysis. The matrix defining the
transformation is obtained from the covariance matrix
$\Sigma_s$ of the signal set:

\[ P^T \Sigma_s P = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_{N-1}) \]

where the $\lambda_i$ are the eigenvalues of $\Sigma_s$ and
P is the matrix of eigenvectors corresponding to the
eigenvalues $\lambda_i$.

While this transformation would appear to be the solution
to the problem, there are aspects of pulse source classification
which nullify its attributes. Most important is the
fact that it is a complete orthogonal transformation only
for the signal set from which it was generated. New features
of new signal source classes will be undetected unless
they contain a linear combination of one or more of the
transform basis vectors. Secondly, since the basis is data
dependent and not composed of a fixed set of orthonormal
vectors, no factorization and hence no fast algorithms
are possible. This means that the transformation will
require $N^2$ real multiplication operations and that unless
the feature space can be greatly reduced, an application
where speed is important cannot use it.

The covariance matrix is calculated for the reduced
feature sets derived from the three fast transforms investigated.
These are presented in the next section in normalized
form as correlation matrices. Ideally, feature
vectors of all signals in all classes, i.e., all observations,
should be used in the calculation of a global correlation
matrix; however, due to machine limitations, only
the class mean feature values were used since they are the
best statistical estimate of actual feature values. The
covariance matrix is then:


y ae) a a(m) 
Yo = Iam Yn en 


= 0 dy 


where i, j range over 1, 2, ..., N independently and k, $\ell$
range over 1, 2, ..., M independently.

The correlation matrix is obtained by normalizing all
elements to the inverse square roots of the diagonal
elements, which are the global variances of the class means:

$$ C = \left[ \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\,\sigma_{jj}}} \right]_{N \times N} $$

where $\sigma_{ij}$ is the (i, j) element of the covariance
matrix and i, j, k, $\ell$ are as defined above.

Off-diagonal elements $C_{ij}$ reflect the degree of
correlation between features of index i and j.
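The normalization step can be illustrated with a short sketch. The
covariance estimate used here (an unweighted average over the class
means) and the function names are assumptions chosen for the example;
only the normalization by the square roots of the diagonal elements is
taken from the text.

```python
import math

def class_mean_covariance(class_means):
    """Covariance of the class mean vectors (unweighted, assumed form)."""
    M = len(class_means)
    N = len(class_means[0])
    # Global mean of each feature over the class means.
    g = [sum(mu[i] for mu in class_means) / M for i in range(N)]
    return [[sum((mu[i] - g[i]) * (mu[j] - g[j]) for mu in class_means) / M
             for j in range(N)]
            for i in range(N)]

def correlation_from_covariance(cov):
    """Divide each element by the square roots of its two diagonal entries."""
    N = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])
             for j in range(N)]
            for i in range(N)]
```

The diagonal of the result is unity by construction, and an
off-diagonal magnitude near one flags a pair of features that carry
largely redundant information.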


D. RANK ORDERING 

To this point only continuous measures in continuous 
vector spaces have been considered. It is possible that a 
discrete space might be entirely suitable if not superior 
when the inter-class distances and intra-class variances 
under a discrete metric are such that a quantized space 
does not increase classification error. 

Consider the case of ordering the features, selected 
for their information content and derived from a complete 
orthonormal transformation of the signal space as above, in 
decreasing value order. If the reordered feature indices 
rather than the feature values are used for classification, 
the information rate between the signal processing device 
and the classifier could be reduced considerably. The 
classifier itself could possibly be simplified. 

Using the data of this thesis for example, data samples
are 8-bit integers and the projections in the transform
space are floating point numbers requiring 32 bits. If the
number of features used for classification is 16, then for
each pulse observation 512 bits must be sent to and processed
by the classifier. Now if only the rank order is preserved,
the 16 features are represented as 4-bit integers and each
pulse observation results in transmission and processing of
64 bits. For a given channel bandwidth, significantly more
information could be sent per unit of time if a suitable
classifier can be found.
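The bit counts quoted above follow from simple arithmetic, sketched
below together with the rank ordering itself. The function names are
hypothetical and serve only to illustrate the idea.

```python
def rank_indices(features):
    """Feature indices arranged in decreasing order of feature value."""
    return sorted(range(len(features)),
                  key=lambda i: features[i], reverse=True)

def bits_for_index(n_features):
    """Bits needed to address one of n_features distinct indices."""
    return max(1, (n_features - 1).bit_length())

def bits_per_observation(n_features, bits_per_value):
    """Total bits sent to the classifier for one pulse observation."""
    return n_features * bits_per_value

# Sixteen 32-bit floating point features versus sixteen 4-bit indices.
full_precision = bits_per_observation(16, 32)
rank_only = bits_per_observation(16, bits_for_index(16))
```

Here `full_precision` evaluates to 512 and `rank_only` to 64, an
eight-fold reduction in the information rate between the signal
processor and the classifier.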

The feature space becomes quantized with N! = N(N-1)(N-2)
...(2)(1) distinct points corresponding to all different
possible orderings of N features. For the case N = 16 there
are more than 2 x 10^13 distinct points. The 3-feature space
is illustrated below.





Figure 9. 3-Space Representation of all Rank
Ordered 3-Vectors (I_1, I_2, I_3)


There are tests which can be applied to ranked sets 
which could find application to this problem. Moroney [10] 
discusses several in the context of evaluating judges asked 
to rank things in order of quality. A test which evaluates 
the degree of agreement within a group of rankings (a class 
cluster) compares the mean squared difference of perfect 
agreement ranking and the expected ranking. The expected 
ranking is the average of all possible rankings and is indeed 
not a ranking at all, but an N-vector with all entries equal 
to N(N+1)/2. The result is a number between 0 and 1 called 
the Coefficient of Concordance by Moroney. This test 
might find use as a feature evaluator since it provides a 
measure of intra-class fluctuation. 

Another test measures the correlation between two
rankings and yields a number, R, between -1 and +1 given
by the empirical-appearing formula

$$ R = 1 - \frac{6\displaystyle\sum_{n=1}^{N} d_n^2}{N(N^2 - 1)} $$

where $d_n$ is the difference between ranked indices. R is
called Spearman's Rank Correlation Coefficient and might
be employed in classification, measuring the correlation
between an unknown ranking and the mean ranking of classes
taken one-at-a-time.
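As a minimal sketch of how such a classifier might operate, the
coefficient can be computed for an unknown ranking against each class
ranking and the largest value selected. The function names and the
nearest-class rule are assumptions for illustration; no such
classifier was implemented in this investigation.

```python
def spearman_r(rank_a, rank_b):
    """Spearman rank correlation of two rankings of the same N items."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))

def nearest_class(unknown, class_rankings):
    """Index of the class ranking most correlated with the unknown one."""
    scores = [spearman_r(unknown, r) for r in class_rankings]
    return max(range(len(scores)), key=lambda m: scores[m])
```

Identical rankings give R = +1 and exactly reversed rankings give
R = -1, so the measure behaves like a correlation coefficient on the
quantized feature space.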

Rank ordering was not considered in this investigation,
but it appears to warrant further study.


V. DISCUSSION OF RESULTS AND CONCLUSIONS


The intention of this research is to explore the feasibility
of generating classification features for pulsed
signals by linear transformation using so-called fast
algorithms. The underlying premise is that the pulse generation
mechanisms of distinct sources impart sufficient information
to the pulse (envelope) shape to allow classification on
this basis. A complete orthonormal transformation process
cannot create information, and, by the completeness property,
does not destroy it. The hypothesis is that such a transformation
will result in a more efficient distribution of
classificatory information than is inherent in the signal.
Restated, the pulse shape representation in signal space
requires consideration of more dimensions for a specified
classification confidence level than does some transform
space defined by a fast discrete method.

In Section IV it was stated that the Karhunen-Loeve 
transform is optimal for a closed, invariant set of features, 
and results in the least dimensionality for a specified 
error tolerance under a MSE metric. It does not, however, 
meet the fast algorithm requirement. Thus it is sought to 
determine if the FFT, FWT, or FHT results in a compacting of 
classificatory information significant enough to warrant 
further investigation and possibly application. 

The discrete representation results in a dimensionality
of 64 for the signal space and each of the transform spaces.
Using the second order statistics of the 79 signal classes,
and treating the projection onto each dimension as a
classification feature, two measures of information content were
applied to each of the transform spaces. The F Ratio metric
was then applied to a subset of 20 signal classes to provide
a comparison between the three transform spaces and the
signal space to substantiate the hypothesis that a
transformation (or rotation of the space) can result in a more
compact representation for classification purposes.


A. INTERPRETATION OF DATA 

A complete listing of numeric data is presented in 
Appendix D. 

A comparison of the transform class prototypes, that is, 
the class estimated centroid in N-space, is shown graphically 
in Figures 10, 11, 12, and 13 for signal, Fourier, Walsh, 
and Haar representations. These figures consist of 79 
superimposed curves consisting of lines connecting data 
points which are signal samples for Fig. 10 and transform 
coefficients for Figs. 11 to 13. The data are scaled
differently for illustration purposes. 

The collection of points of intersection of the overlaid
curves and a line drawn vertically from any index
point, n, gives one an indication of the distribution of
the class mean values, $\hat{\mu}_n^{(m)}$, on the n-th dimension.

Figures 14, 15, and 16 are graphs illustrating the
measure of classificatory information and its distribution.
The horizontal axis is calibrated by index of decreasing
rank of magnitude in test results, not in original coefficient
index order. Tables D1 to D10 of Appendix D list the
numerical values in both original coefficient and rank order.
1. Comments on Signal Data 

Before a meaningful comparison of any data can be
made it must be normalized or scaled to some reference.
In the case of the transform coefficients, this reference
is the zero-th order coefficient or average value of the
signal. The same reference is used in the feature selection
tests. In Figure 10, each curve is scaled to have the same
maximum value, which may be misleading.

The superimposed curves show that there is consider- 
able error in estimating the leading edge of the pulses 
from which the prototypes are estimated. By linearly 
extrapolating the estimated actual pulse origin, it is 
apparent that an error on the order of 8% of the average 
pulse width is present. That it appears in the class proto- 
type indicates an inconsistency in the leading edge deter- 
mination process which is manual. A threshold crossing 
decision would have minimized this error which undoubtedly 
affected the Walsh and Haar data due to the non-time-invari- 
ant nature of these transformations. 

To illustrate the relative effects of time window
jitter and quantizer noise, a pulse class (9-11) was selected
at random for inspection of each signal and transform used
to generate the class prototype. Figures 17 - 20 show the
signal, Fourier, Walsh, and Haar coefficients of the class
as superimposed curves. In Figure 17 both time jitter and
quantizing effects are apparent. The FFT data, Figure 18,
show no visible coefficient variation, while FWT and FHT
data, Figures 19 and 20 respectively, show that some
coefficients are quite noisy. The Haar functions of index $2^m$,
m = 1, 2, ..., 5, are non-zero only during the first $2^{6-m}$
signal samples and thus reflect the effect of time jitter
to the greatest extent in their respective coefficients.
Results of the F-Ratio test performed on the signal
sample data for the 20 classes 6-11 through 10-12 (see
Fig. 16) indicate that most of the information of classification
value, as determined by this metric, is distributed
fairly uniformly over 32 of the 64 samples. Figure 21 shows
how the information for these classes is distributed in
signal space (time) order. This is somewhat surprising in
that the leading edge region is considered by this metric to
be useless while the latter midsection and trailing edge
region rates high. This result is believed due to the large
variance in edge data caused by the time jitter mentioned
above. The trailing edges are affected on an individual
class basis rather than globally, which does not tend to
lower the average for the whole ensemble. Given an accurate
time-of-arrival (TOA) estimate it is conjectured that the
leading edge region would rank high also. This would tend
to increase the necessary dimensionality by including more
samples in the "good feature" category.

2. Comments on Transform Data

Because of the magnitude operation on the Fourier
sine and cosine pairs, the number of unique coefficients is
reduced by half. This operation is time consuming but
results in time-invariant features which, in light of the
jitter present in the data base, would tend to favor the FFT
in this comparison. Not so fortunate are the Walsh and Haar
bases, both of which are affected by time reference variation.
Figures 14 and 15 compare the information distributions in
the three spaces while Figure 16 includes signal data as well.

From Figures 14 and 15 it is apparent that the Fourier
basis has several clear advantages. Most of the useful
information is in the first 12 coefficients. Not only is the
information concentrated in a few features, but that
information is a monotone decreasing function of index. Thus
the order of the transform, N, and the time of execution can
be reduced considerably. For an order reduction R, which is
a power of 2, the number of arithmetic operations is reduced
by $R \log_2 R$. For the case considered here the savings in
arithmetic operations amounts to a reduction factor of 8.

The FWT and FHT data are difficult to interpret due
to the time jitter. All Walsh coefficients are global in
that they are functions of all signal data points, whereas
all Haar coefficients except the first and second are local.
See Appendix B. The first two Haar functions are identical
to the first two sequency order Walsh functions and hence
will generate identical coefficients.

Time jitter may have two effects on the FWT coefficients.
It will certainly produce a variation in coefficient
values which would reduce their effectiveness as classification
features. Furthermore, in the case of higher order
coefficients, this variation might tend to make the clustering
multimodal. The variance ratio feature selection tests
used in this work fail when clusters are not unimodal. This
may explain why so many of the Walsh coefficients have large
apparent information content.

The similarity of Haar functions to both Walsh 
functions and so-called block pulses (which are the set of 
basis functions for the signal space) is apparent in Figure 
16. The Haar coefficient curve is similar to the Walsh 
coefficient curve for those features of high information 
content and to the signal sample curve for those of little 
apparent information. 

Condensed correlation matrices for the three transform
spaces are shown in Tables D11 to D13. Only the eight
features having the highest classificatory information as
determined by the 79 class F-Ratio test are included.
Evident is a high degree of correlation between FWT and FHT
coefficients which may be due to multimodal clustering or to
poor resolution of a given signal detail by anything but an
extended linear combination of Walsh or Haar basis functions.
In this respect the Fourier basis also excels as evidenced
by much smaller, but still considerable, inter-coefficient
correlation. Once again, this may be due to the time
invariance of the Fourier basis.




B. CONCLUSIONS 

On the basis of the results of this investigation it is
concluded that the Fourier basis as represented by the FFT
can produce a dimensionality reduction factor of 5 or 6 for
the signal data base employed. If actual pulse signal
emitters of a common type display this degree of pulse shape
dissimilarity then efficient classification should be
possible on the basis of signal envelope shape. The effects
of additive noise, multipath propagation, and signal distortion
resulting from pulse-to-pulse amplitude variation and
a nonlinear (square-law) detector were not investigated and
would certainly degrade the value of the selected features
for classification purposes.

No positive conclusions can be drawn from the Walsh and 
Haar transform results due to the jitter present in the sig- 
nal data base. Further investigation may show that in the 
absence of time window jitter one of these transforms may 
exhibit the capability for dimensionality reduction to an 
extent that its use as a feature generator is feasible. 

The fact that the FWT and FHT are extremely fast makes them 


highly desirable for real-time processing.


Figure 14. F-Ratio Test Information Measure of
Representations in Three Bases for all
79 Classes as Functions of Test Rank Index

Figure 15. G-Variance Ratio Test Information Measure of
Representations in Three Bases for all
79 Classes as Functions of Test Rank Index

Figure 16. F-Ratio Test Information Measure of Four Basis
Representations for 20 Classes (6-11 to 10-12)
as Functions of Test Rank Index

Figure 17.

Figure 18. Overlay of the 25 Sets of FFT Coefficients
of the Signals Defining Class 9-11

Figure 19. Overlay of the 25 Sets of FWT Coefficients
of the Signals Defining Class 9-11

Figure 20.

Figure 21. F-Ratio Test Information Measure of Signal
Samples for 20 Classes (6-11 to 10-12) as a
Function of Sample Index


APPENDIX A


LISTING OF FAST SUBROUTINES 


      SUBROUTINE FFT(M,REAL,SIGNF)
      DIMENSION S(2,64),RIM(64),REAL(64)
C
C     FAST FOURIER TRANSFORM
C     M - LOG2(NUMBER OF SAMPLES)
C     REAL - I/O ARRAY
C     SIGNF - DIRECTION OF TRANSFORM
C     OUTPUT IS IN MAGNITUDE SQUARED FORM
C
      N = 2**M
      NHALF = N/2
      FLOTN = N
      PIARG = 6.2831853 / FLOTN * SIGNF
      DO 1000 I=1,N
 1000 RIM(I) = 0
      DO 3000 I=1,M
      N2J = 2**(M-I)
      NI = 2**(I-1)
      DO 2000 J=1,NI
      IN2I = (J-1) * N2J
      THETA = FLOAT(IN2I) * PIARG
      C = COS(THETA)
      SI = SIN(THETA)
      DO 2000 K=1,N2J
      IN0 = K + IN2I
      IN1 = K + 2*IN2I
      IN2 = IN1 + N2J
      IN3 = IN0 + NHALF
C     COMPLEX MULTIPLY
      CR = C * REAL(IN2) - SI * RIM(IN2)
      CI = SI * REAL(IN2) + C * RIM(IN2)
      S(1,IN0) = REAL(IN1) + CR
      S(2,IN0) = RIM(IN1) + CI
      S(1,IN3) = REAL(IN1) - CR
      S(2,IN3) = RIM(IN1) - CI
 2000 CONTINUE
      DO 3000 L=1,N
      REAL(L) = S(1,L)
      RIM(L) = S(2,L)
 3000 CONTINUE
C     COMPUTE MAGNITUDE SQUARED
      DO 4000 I=1,N
 4000 REAL(I) = REAL(I)*REAL(I) + RIM(I)*RIM(I)
      RETURN
      END


      SUBROUTINE FWT(M,X)
      DIMENSION X(1)
C     FAST WALSH XFORM
C     M - LOG2(N)
C     N - NUMBER OF SAMPLES
C     X = I/O ARRAY: (1:N)=I/O; (N+1:2N)=SCRATCH
C
      N = 2**M
      NH = N / 2
      LR = 0
      DO 1000 L=1,M
      LP = L + 1
      LM = L - 1
      LR = N - LR
      LT = N - LR
      NY = 0
      NZ = 2**LM
      NZI = 2 * NZ
      NZN = N / NZI
      DO 1000 I=1,NZN
      NX = NY + 1
      NY = NY + NZ
      JS = (I-1) * NZI
      JD = JS + NZI + 1
      DO 1000 J=NX,NY
      JS = JS + 1
      JT = J + NH
      LJS = LR + JS
      LTJ = LT + J
      LTJT = LT + JT
      X(LJS) = X(LTJ) + X(LTJT)
      JD = JD - 1
      LJD = LR + JD
 1000 X(LJD) = X(LTJ) - X(LTJT)
      IF ( LR ) 1500,3000,1500
 1500 DO 2000 I=1,N
      IPN = I + N
 2000 X(I) = X(IPN)
 3000 RETURN
      END



      SUBROUTINE FHT(M,S)
      DIMENSION S(64),H(64)
C
C     FAST HAAR TRANSFORM
C
C     M - LOG2(NR OF DATA POINTS)
C     S - I/O VECTOR OF LENGTH 2**M
C     H - SCRATCH VECTOR
C
C     FHT REQUIRES 2*(N-1) REAL ADD OPERATIONS
C
      N=2**M
      NH=N
      DO 4000 I=1,M
      NH=NH/2
      DO 1000 J=1,NH
      I1=J
      I2=J+NH
      J2=2*J
      J1=J2-1
      H(I1)=S(J1)+S(J2)
      H(I2)=S(J1)-S(J2)
 1000 CONTINUE
      NH2=NH*2
      DO 2000 J=1,NH2
      S(J)=H(J)
 2000 CONTINUE
      GO TO (4000,3000),I
 3000 NH21=NH2+1
      DO 4000 J=NH21,N
      S(J)=S(J)*1.414213562
 4000 CONTINUE
      RETURN
      END


APPENDIX B


WALSH AND HAAR FUNCTIONS AND MATRICES 


The increasingly familiar Walsh functions and the less
well known Haar functions originated in the early 20th
century. J. A. Barrett, as described by Fowle [11], was
perhaps the first to discover Walsh functions, using them
as the basis of a telegraph wire transposition scheme to
reduce crosstalk. J. L. Walsh in 1923 [12] formalized the
set of complete orthogonal bivalued functions defined on the
unit interval [0,1] which now bear his name. An important
orthogonal but incomplete subset of the Walsh functions are
the square waves known as Rademacher functions after H. A.
Rademacher [13], who developed them as part of a unified
theory of orthogonal functions in the early 20's.

Much of the recent interest in application of Walsh
functions was stimulated by their adaptability to digital
processing. For example, a discrete Walsh matrix, like the
discrete Fourier matrix of sampled sinusoids, contains the
symmetry and redundancy required for a fast transform
algorithm based on matrix factorization. Because of the bivalued
nature of the functions, the fast Walsh transform or any
Walsh function based processing is inherently suited to
digital implementation. Harmuth [14] proposes many and
varied uses for Walsh functions in applications from signal
processing to communication data multiplexing.

The set of Walsh functions of order N = 2^n for all
non-negative integers, n, forms an Abelian group under
multiplication. That is, the product, equi-argument wise,
of any two functions of the set is another member of the
set. The first eight Walsh functions are shown below in
Figure 22.


Figure 22. Continuous Walsh Functions of Order 8

The ordering shown here is the so-called sequency order
after Harmuth who defines sequency as the average number
of zero crossings per unit interval, (0,1).

Harmuth chooses to define the Walsh functions on [-1/2,+1/2]
and employs the notation Cal(s,x), Sal(s,x) to accentuate
the symmetry similarities to the sinusoidal trigonometric
functions. Sequency, s, is now defined as one-half the
average number of zero crossings per unit interval [-1/2,+1/2).

The discrete Walsh functions, W_N(i,k), i = 0,1,...,N-1
and k = 0,1,...,N-1 are formed by sampling the continuous
Walsh functions at N equally spaced points on the interval
of definition. The discrete form is most conveniently
shown in matrix form as in Figure 23, below.


     +  +  +  +  +  +  +  +        W_8(0,k)
     +  +  +  +  -  -  -  -        W_8(1,k)
     +  +  -  -  -  -  +  +        W_8(2,k)
     +  +  -  -  +  +  -  -        W_8(3,k)
     +  -  -  +  +  -  -  +        W_8(4,k)
     +  -  -  +  -  +  +  -        W_8(5,k)
     +  -  +  -  -  +  -  +        W_8(6,k)
     +  -  +  -  +  -  +  -        W_8(7,k)


Figure 23. Walsh Sequency Matrix of Order 8
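The group and orthogonality properties of the discrete functions are
easy to confirm numerically. The sketch below builds the sequency
ordered matrix by a route chosen for brevity, namely sorting the rows
of a Sylvester-type Hadamard matrix by their number of sign changes;
this is an assumption of the example and is not the generating method
of the references or of the FWT listing in Appendix A.

```python
def hadamard(n):
    """Natural-ordered Hadamard matrix of order n (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = ([row + row for row in h] +
             [row + [-x for x in row] for row in h])
    return h

def sign_changes(row):
    """Number of sign changes along a row, i.e. its zero crossings."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)

def walsh_sequency(n):
    """Rows of the Hadamard matrix rearranged into sequency order."""
    return sorted(hadamard(n), key=sign_changes)

W = walsh_sequency(8)
rows = {tuple(r) for r in W}

# Closure: the element-wise product of any two rows is again a row.
closed = all(tuple(a * b for a, b in zip(u, v)) in rows
             for u in W for v in W)

# Orthogonality: distinct rows have zero inner product.
orthogonal = all(sum(a * b for a, b in zip(W[i], W[j])) == (8 if i == j else 0)
                 for i in range(8) for j in range(8))
```

Row i of `W` has exactly i sign changes, which is the ordering shown
in Figure 23.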


The Haar functions form a complete orthogonal but
non-orthonormal set of bivalued functions on [0,1], and were
first published by A. Haar in 1909 [15]. This set is related
to the set of Walsh functions as pointed out by Fino [6],
but appears considerably different. The orthogonal Haar
functions attain values +1, -1, and 0 as shown below in
Figure 24, which clearly illustrates the increasingly
local nature of higher orders. The literature indexes
Haar functions by a subscript and a superscript, a system
which provides insight to the shape of a function from its
indices but is somewhat clumsy for this work which employs
a single subscript index.


Figure 24. Continuous Haar Functions of Order 8

This orthogonal set may be normalized by multiplying each
orthogonal function by $2^{k/2}$, where k is the sub-index
of the function.

The discrete Haar functions of order N are formed, like
the discrete Walsh functions, by sampling the continuous
functions at N equally spaced points on the interval of
definition. The N-square Haar matrix formed of the first
N discrete Haar functions is shown in Figure 25 in orthogonal
form.


            +  +  +  +  +  +  +  +
            +  +  +  +  -  -  -  -
            +  +  -  -  0  0  0  0
     H_8 =  0  0  0  0  +  +  -  -
            +  -  0  0  0  0  0  0
            0  0  +  -  0  0  0  0
            0  0  0  0  +  -  0  0
            0  0  0  0  0  0  +  -


Figure 25. Orthogonal Haar Matrix of Order 8
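The relation between the orthogonal Haar matrix and a fast
add/subtract algorithm can be checked with a short sketch. Both
functions below are illustrative assumptions written for this purpose;
in particular `fast_haar` omits the square-root-of-two scaling and is
not a transcription of the FHT listing in Appendix A.

```python
def haar_matrix(n):
    """Orthogonal (unnormalized) Haar matrix of order n, n a power of 2."""
    rows = [[1] * n]
    width = n
    while width > 1:
        half = width // 2
        for start in range(0, n, width):
            row = [0] * n
            for i in range(start, start + half):
                row[i] = 1
            for i in range(start + half, start + width):
                row[i] = -1
            rows.append(row)
        width = half
    return rows

def fast_haar(s):
    """Unnormalized Haar coefficients using only adds and subtracts."""
    s = list(s)
    nh = len(s)
    while nh > 1:
        nh //= 2
        sums = [s[2 * j] + s[2 * j + 1] for j in range(nh)]
        diffs = [s[2 * j] - s[2 * j + 1] for j in range(nh)]
        s[:2 * nh] = sums + diffs   # details already stored stay in place
    return s

x = [3, 1, 4, 1, 5, 9, 2, 6]
by_matrix = [sum(h * v for h, v in zip(row, x)) for row in haar_matrix(8)]
by_fast = fast_haar(x)
```

The two results agree element by element, while the fast form uses
2(N-1) additions and subtractions instead of the N-square operations
of the matrix product.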


Both Walsh and Haar matrices contain high redundancy 
which has led to not only the fast transform algorithms but 
to a variety of generating methods based on their internal 
symmetry, [6], [16]-[20]. The fast Walsh and Haar algorithms 
used in this research are adapted from papers by Robinson 


[18] and Rejchrt [20] respectively.


APPENDIX C


GENERALIZED FOURIER SERIES 


Consider the infinite dimensional signal (vector) space,
S, consisting of all continuous physically realizable signals
or functions defined on $a \le x \le b$. On this space is defined
an inner product or projection operation

$$ f \cdot g = \int_a^b f(x)\,g(x)\,dx . $$

S contains orthonormal systems of infinitely many vectors.
Let $E = \{\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_k, \ldots\}$ be one such system. The
orthonormality condition states that the $\varepsilon_i$ satisfy

$$ \varepsilon_i \cdot \varepsilon_j = \delta_{ij} =
   \begin{cases} 0 & \text{for } i \ne j \\ 1 & \text{for } i = j \end{cases} $$

for all non-negative integer indices i and j.

An arbitrarily chosen signal, s(x), in S can be represented
by sequentially nested subsets of E, each of which
spans a subspace of S. For clarity it should be noted that
any segment of the real line can be considered an infinite
dimensional vector space, hence s(x) can be expressed as
s, depending on the context.


The signal (vector) s possesses a "best" $S_1$ approximator
$s_1$ in the subspace $S_1$ spanned by $\varepsilon_0$ and is given by
$s_1 = p_0\varepsilon_0$, where $p_0 = (s \cdot \varepsilon_0)$ is the projection of s
onto $\varepsilon_0$. In other words, $p_0\varepsilon_0$ is the vector in $S_1$ which is
by some measure closest to s of all vectors in $S_1$. Similarly,
s possesses a "best" $S_2$ approximator in $S_2$,

$$ s_2 = p_0\varepsilon_0 + p_1\varepsilon_1 $$

and a "best" $S_k$ approximator computed in the same manner
in the subspace spanned by $E_k = \{\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{k-1}\}$, which is

$$ s_k = p_0\varepsilon_0 + p_1\varepsilon_1 + \cdots + p_{k-1}\varepsilon_{k-1} . $$

By virtue of the orthogonality of the $\varepsilon_i$, each coefficient
$p_i$ is invariant in the $S_k$ approximations for k > i, i = 0, 1, 2, ...
The limiting approximator,

$$ s_\infty = \sum_{i=0}^{\infty} p_i\,\varepsilon_i , $$

is called the Fourier E-coefficient expansion of s, and the
coefficients $p_i = (s \cdot \varepsilon_i)$ are the Fourier E-coefficients.

As implied above, a Fourier E-coefficient expansion of
s has the property that for each successive k = 1, 2, ...,
the $s_k$ approximator formed of the first k terms is "best"
in the sense that there is no other vector "closer" to s
in the subspace $S_k$. Implicit here is that $s_k$ is at least
as good as $s_{k-1}$.

"Closest" as used here is in the sense of Euclidean
distance or the norm of the difference between the two vectors
s and $s_k$. "Best" implies that $s_k$ is the closest of all $S_k$
approximators to s. To formalize the notion, the distance is
given by

$$ \| s - s_k \|^2 = (s - s_k) \cdot (s - s_k) $$


To prove that $s_k$ is the best $S_k$ approximator of s in the
norm, choose an arbitrary $S_k$ approximator

$$ s_k' = \sum_{j=0}^{k-1} p_j'\,\varepsilon_j $$

and determine the coefficients $p_j'$ which make the norm
$\| s - s_k' \|$ smallest, or equivalently, minimize $\| s - s_k' \|^2$.

$$
\begin{aligned}
\| s - s_k' \|^2 &= (s - s_k') \cdot (s - s_k') \\
 &= s \cdot s - 2\sum_{j=0}^{k-1} p_j'\,(s \cdot \varepsilon_j)
    + \sum_{j=0}^{k-1}\sum_{i=0}^{k-1} p_j'\,p_i'\,(\varepsilon_j \cdot \varepsilon_i) \\
 &= \| s \|^2 - 2\sum_{j=0}^{k-1} p_j'\,p_j + \sum_{j=0}^{k-1} (p_j')^2 \\
 &= \| s \|^2 + \sum_{j=0}^{k-1} (p_j' - p_j)^2 - \sum_{j=0}^{k-1} (p_j)^2
\end{aligned}
$$

Thus $p_j' = p_j = s \cdot \varepsilon_j$, j = 0, 1, 2, ..., k-1 is the
coefficient set which results in the best $S_k$ approximator.

The limiting case of the best $S_k$ approximator does not
imply that $\lim_{k \to \infty} \| s - s_k \| = 0$. It is conceivable that we
can find an s which possesses components which are orthogonal
to every vector in the infinite set E. One example is the
infinite system

$$ E = \left\{ \frac{1}{\sqrt{2\pi}},\ \frac{1}{\sqrt{\pi}}\sin(x),\ \frac{1}{\sqrt{\pi}}\sin(2x),\ \ldots \right\} $$

spanning the space of continuous functions defined on
$-\pi \le x \le \pi$. E is orthonormal and infinite, yet the best $S_k$
approximator for f(x) = A cos(x) is zero for all k. This
introduces the notion of completeness. An infinite orthonormal
system E is said to be a complete orthonormal system
if for every $s \in S$, the norm $\| s - s_k \| \to 0$ as $k \to \infty$.
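The incompleteness of a sine-only system can be seen numerically.
The sketch below assumes the system consists of a normalized constant
followed by normalized sines on the interval from -pi to +pi, and uses
a simple midpoint rule for the inner product; the function names and
the step count are choices made for the illustration.

```python
import math

def inner(f, g, a=-math.pi, b=math.pi, steps=2000):
    """Midpoint-rule estimate of the inner product of f and g on [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h)
               for i in range(steps)) * h

def sine_system(k):
    """First k members of the assumed system: a constant, then sines."""
    basis = [lambda x: 1.0 / math.sqrt(2.0 * math.pi)]
    for n in range(1, k):
        # n=n freezes the harmonic number inside each lambda.
        basis.append(lambda x, n=n: math.sin(n * x) / math.sqrt(math.pi))
    return basis

def f(x):
    return 2.0 * math.cos(x)        # A cos(x) with A = 2

coeffs = [inner(f, e) for e in sine_system(6)]
energy = inner(f, f)                # squared norm of f
```

Every entry of `coeffs` is zero to within rounding error, while
`energy` is close to 4 pi. The approximator therefore stays at zero
and the error norm never decreases, however many terms are taken.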

To this point the discussion has been limited to continuous
signal and infinite dimensional signal (vector) spaces.
The results can be modified to cover finite dimensional, say
N, signal spaces which are not unbounded, that is, the set
of N-dimensional vectors whose elements are real numbers
possibly obtained by sampling the value of continuous
functions at N equally spaced points on the interval
$a \le x \le b$. It is implicitly assumed that the constraints
placed by the sampling theorem are met.

We define a discrete inner product operation on the
space $S_N$ as

$$ f \cdot g = f\,g^{T} = \sum_{n=0}^{N-1} f_n\,g_n $$

where $f = (f_0, f_1, \ldots, f_{N-1})$ and $g = (g_0, g_1, \ldots, g_{N-1})$.

Let $D_N = \{d_0, d_1, \ldots, d_{N-1}\}$ be a discrete N-dimensional
orthonormal system spanning $S_N$. The $d_i$ are 1xN vectors
satisfying

$$ d_i \cdot d_j = d_i\,d_j^{T} = \delta_{ij} $$

For any signal vector in $S_N$, say $s = (s_0, s_1, \ldots, s_{N-1})$,
there is a best $S_k$ approximator in the norm given by

$$ s_k = \sum_{i=0}^{k-1} p_i\,d_i \qquad \text{for } k = 1, 2, \ldots, N $$

where

$$ p_i = s \cdot d_i = s\,d_i^{T} = \sum_{n=0}^{N-1} s_n\,d_{i,n} . $$

The N-dimensional system $D_N$ is said to be a complete
discrete orthonormal system if for every $s \in S_N$ the norm
$\| s - s_N \| = 0$.
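The behaviour of the truncated expansion can be illustrated for a
small case. The sketch below takes the rows of an order 8 Hadamard
matrix, scaled to unit norm, as one possible complete discrete
orthonormal system; the choice of basis, the test vector, and the
function names are assumptions made for the example.

```python
import math

def hadamard(n):
    """Natural-ordered Hadamard matrix of order n (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = ([row + row for row in h] +
             [row + [-x for x in row] for row in h])
    return h

def orthonormal_basis(n):
    """Hadamard rows scaled by 1/sqrt(n) so each has unit norm."""
    scale = 1.0 / math.sqrt(n)
    return [[scale * x for x in row] for row in hadamard(n)]

def dot(u, v):
    """Discrete inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def approximator(s, basis, k):
    """Best approximator built from the first k basis vectors."""
    coeffs = [dot(s, d) for d in basis[:k]]
    return [sum(p * d[n] for p, d in zip(coeffs, basis))
            for n in range(len(s))]

def error_norm(s, basis, k):
    """Euclidean norm of the difference between s and its approximator."""
    s_k = approximator(s, basis, k)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, s_k)))

s = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
D = orthonormal_basis(8)
errors = [error_norm(s, D, k) for k in range(1, 9)]
```

The entries of `errors` never increase with k and the last one is
zero to within rounding, which is the completeness condition for a
discrete system. How quickly the error falls for small k depends on
the basis, which is the question taken up in the remainder of this
appendix.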

In the above discussion, general orthonormal systems 
spanning continuous and discrete signal spaces have been 
considered. Nothing has been said about which orthonormal 
system or basis may be best suited to representation of a 
certain category of signals in the space. 

A given signal, v, in the discrete signal space $S_N$
possesses best $S_k$ approximators in every orthonormal basis
in $S_N$. However the best $S_k$ approximator in one basis will
in general possess a greater or smaller norm error than the
best $S_k$ approximator in another basis. Since there is an
infinite number of signals possible in $S_N$, the determination
of the best orthonormal basis to represent a particular
category of signals by a truncated series, that is an $S_k$
approximator, is more than a casual matter.





APPENDIX D 
TABULATION OF NUMERICAL RESULTS 


SIGNAL SPACE 6-11 TO 10-12


F-RATIO VECTOR. NR SIG PER CLASS = 25 NR CLASSES = 20
N GLOBAL MEAN F-RATIO RANK F-RATIO
1 2222070307 .55855E+01 28 2e93782EtG4 
Peesee6lsesl2. eS ICSE! Clee s toon ee 
5S 2323846925 .57926EtG1 29 = 89GG4E4+04 
4 .5410874228 .80359EtG1 26) e28G17ErO4 
Dee o47T lAS4Ie eciopceryl 539 ~5O@G5GE+GA 
Se el21525519 WS6SASETO! oD eA465S7ET V4 
T eSASISSITS)§ .T5660E+01 24 .SIJESEtGDA 
8 2347734392 .35340E+01 51 ,SUS6IEFD4 
9 2343847632 .599355E+D1 41 .22867E+54 

Ho 6942852029 S713 5E+91 45 .22e955EtrD4 
mueeeotioglS4l 118259802 42 42h oD EF G4 
fe 2eSACSI5SE5 86677 TSEFD] Zoe oc ie oe On 

Bie ee 0442577T12 .65022E0!1 539 .I8825E754 

14 344140649 ,19725E+91 46 .18669E704 
Id 2544668031 .58025Er1 AAR. eee 
16 «238440035975 .7992TEtO! ce) Oe LG BO5EABA 
17 2341718793 .29640E+62 932 e16522EF04 
‘183 2997812543 ~148092E+83 58 .2lSG63E+04 
19 2336386681 25212583 53. slooota 4 

28 3534472597 .66148E+83 45 Og 15855E4AGA 

21 243346993839 .1803358E+4 54 1291 1EtG4 

22 «2358437557 J 1S895E+G4 ST 2124442404 

Bee «6 O4GAG4SIS «=O L2L 429 EFO4 568 ell i> iaev4 

PA 69968015625) «=o SI DGS EF4 ol eI035eEeGA 

22 e967402315 49567E+B4 55 .95942E+03 

C6 2 D88H39996) .SEAITE+BA 46 .389299E+83 

Meo la code) 61551 TETO4 CBO .S6148E4+03 

28 e401757836 937825404 AT ,. 294268463 

29 ee 4ANB632874 = 4 69094E+G4 I9 oe 252128+03 

539 .2411835963 .5@050E+G4 48 .e25078E+63 

531 .419829305 ,.38961 E64 18 6g L42892E4+03 

TABLE D1. Feature Selector (F-Ratio) Test on Signal 


Samples of 20 Classes (6-11 to 10-12) 
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SIGNAL SPACE (CONT) 


32 .485371870 .165228+04 
33) 69958808293 64 13S857E+04 
34 23816480673 .1291T1E+64 
35 3865371048 .959425+03 
36 4846074224 .J1LIT57E+G4 
37 2322753996 .12444E+04 
3S 2298769534 ,166063E+O4 
39 2278624995 .18825E+O4 
4G .242421865 .186695+04 
Al .213886738 .22867°E+2¢4 
AQ .185917914 .215555+64 
43 2158691436 .22955E+94 
44 .1333280332 ,1IT7ITLEFGA 
45 .168&623142 .13855f+04 
46 .@86152345 .8§92998+93 
AT ,.@66230476 ,50426E+03 
48 .84871@0942 .25078E+03 
49 .936562510 .1646815+93 
58 .829863276 .449188+92 
51 227402349 .17336E+02 
52 026562581 .168588+02 
55) = 8273824218 .68629E+81 
54 .827246099 .f482sktOl 
55 .@271875894 .61496E+91 
56 20274218735 .639498+01 
57 827285166 .69157E+9!1 
58 827148437 .68517E+0! 
59 .@274668948 .56056E+9] 
60 .0@27226567 .628618-51 
Gre «9272255980 .63207E+5] 
62 .827382819 .68556E+6H1 
63 28427246099 .S1l1l42b61 
TABLE D1. (continued) 




eo lL BAB1LE+B3S 
eAASISEFD?S 
e296 405+982 
e LTS 56E+ B82 
o 1IS6858E+ 82 
e 1O259E+C2 
eSTTTILEFSI] 
08 43528E+91 
80339 E81 
0 /9927E+B1 
e TI6EGE+DI 
eo §9157&+01 
o 68629 E14) 
eS8517E+81 
eS71S5E1O]1 
e65222E+0]1 
S39 40E+ 81 
© $3297E+681 
eS2891E+01 
oS1496E+81 
eS1142E+G1 
o 6D556E+ 81 
oP S955EtH 1 
02 80235E+81 
ool 925272 
oD 7TZ295E+B1 
e 96B56E+B1 
oD IGS5SE+D1 
eS6ODEGE+GI 
eS DSZ4BE+G 1 
2 1DDGETO! 
e197T2Z5E+D1 





FOURIER 


F-RATIO VECTOR. 


N GLOBAL MEAN 
1 158854395 
2 828901598 
5 8878352460 
4 .883212961 
(5 892258407 
6 029839599 
T 889358598 
8 009244518 
9 eb60C1951E2 
18 =. 3OH0114297 
ll 4 = .9988877852 
12 6 9008854494 
[3 8230939765 
14 .8@890350612 
15 ,~@88822385e 
16 .BPVISBISI1I5 
17) =.8@808315189 
IS .@06911138 
19 ~8880908543 
28 8920898985 
21 888909220 
22 «=e VOBBISS3B 
25 eUO2906571 
24 ~BIIZOE54E 
Come JIO00I89S 
26 6908025549 
ot 880385772 
26 »BB0B804928 
29 =e MIVBL5287 
$8 989905183 
S31 8808004341 
TABLE D2. 


6-11 TO 10-12 


NR SIG PER CLASS = 


F - RATIO 


e63360E+ 25 
e497T42 E+ 05 
oS4962 +05 
ec l429Eta5 
oe SS6S5EtD4 
el 4D1GE+O4 
oS 68356E+B3 
el 7T92TE+B3 
651615482 
ec S8AGETR2 
o LS88SEt+ G2 
© L4263Et+ G2 
eDISTAE+BI 
of 15G4E+G2 
42261E+01 
e SSZ95ETO1 
eIZ2134E+91 
oS4680E+61 
ee l4Skt+Ol 
@eS25T71EtO] 
o IS0S4E+0 1] 
© 49B47E+D1 
eDD TL UEt+G) 
eo S858 EAI 
eo S641 4E+3] 
© (94468481 
eATOT4AE+SB1 
eo S8SSETDI 
oS2960E+291 
09635 4E+ 91 
o2I476E+G91 


RANK 


WOM ANTOD UW BL we Ch 


19 


22 


F - RATIO 


e64962E+ 85 
0 §33560E+ 05 
e49TA2ZE+B5 
ecl42e9h+05 
e SSE6SEtOS4 
e L4010r+04 
eo 68S56E19S 
el 79S27E+9S5 
eS5161 E02 
ec S8AGETBZ 
el 4e635E+02 
o 1 S885E+G2 
el L504E+82 
e96354E1+ 21 
o84680E+G1 
eo 79446E+ 01 
e /50354Et+O1 
oS2oT7TLEt ol 
eDTSTAETO] 
oI (BOEt+H1 
eI SB858ETB1 
eo SSSSETH1 
eI2lTA43E+O1 
e215 4E+D 1 
eASB4TE+Z 1 
eATDT4E+O1 
e42061E 91 
eS SB95Et+B 1 
eS6414E+01 
eS2CO6OE+GSI 
ec 4TGETD1 


Feature Selector (F-Ratio) Test on Fourier 


Magnitude Coefficients of 20 Classes 
(6-11 to 10-12) 




NR CLASSES 





WALSH 6-11 TO 10-12 


F-RATIO VECTOR. NR SIG PER CLASS = 25 NR CLASSES = 20 


N GLOBAL MEAN F - RATIO RANK F - RATIO 
1 .421459854 .94382E+O3 5S e1S845E+O5 
Me-eciGoTl8s? .16822Er92 G6 .29463E+084 
5 188234492 .13845E+@5 9 .IS3SIGEFO4 
4 .181912975 .le2e7Tdetro4 25 614192E+04 
5 ~e1BZ8674778 .95SEBLEtDS 2l 13941 E+B4 
6 -.158544034 .29463E+04 A .~12270E+@4 
7 952469350 .18585E+04 ID =.1O815SE+DA 
G8 --8227125660 .9235422+05 T el1O585E+O4 
9 -.814492664 ,I193516E+D4 26 £SDT4DE+OS 
19 -,0@28861068§ .10815E+G4 I 4943582E+B3 
J! -.945914553 =. 45280E+93 2 e95880EO3S 
12 6824911135 .48538E+a3 8 92542E+93 
13 -,863686222 .93S3HE8ED2 25 26 81742E+D3 
14 -.985314587 A431 8GEtO3 DT eo S9TS BEDS 
15 .812225481 .12882E+82 24 .67505E+03 
16 -.8@15699737 .52419E+G2 58 =e DABBSBETDS 
17 -.816392939 .49295E+83 17) 6 49295E+83 
16 -.811567556 .24585E+03 If fp AGSSSEtDS 
19 -.818360678 .40562E+G3 CT ~44955E+83 
26 ~.026983095 .43928E+63 20 .4392EE+035 
21 -.814538482 13941 EtE4 110 6 432808423 
22 *eO1BO4ATTI5 256 46Et+B3 14 ,43518GE+33 
PS = VIF147125 .81TAZEEOS IS 6 40562: 03S 
24 -.815672214 .67505E+3 2 =e SSSISE AS 
22 —~eH12974054 .,14192E+04 25 =e STOAILETZS 
26 -.8167253857 8 .S9574DEtS3 296 =e STDT4EtF ZS 
2l -.926574213 44935E+93 SO 548S2EtDS 
28 808189458 .S8519E+C3 Doe oc lo oG Lae 
29 72954384355) .7992T7E+ G2 599 ep C6LSOETSS 
30 -.246341380 .34832E+83 C2 el D6AGETLS 
31 883344229 02 4585Et+ 03S 


oS5294EtO1 1g 


TABLE D3. Feature Selector (F-Ratio) Test on Walsh 


Coefficients of 20 Classes (6-11 to 10-12) 







WALSH 6-11 TO 10-12 (CONT) 
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TABLE D4. Feature Selector (F-Ratio) Test on Haar 


Coefficients of 20 Classes (6-11 to 10-12) 
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N GLOBAL MEAN F - RATIO RANK F - RATIO 
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TABLE D5. Feature Selector (F-Ratio) Test on Fourier 


Magnitude Coefficients of 79 Classes 
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TABLE D6. Feature Selector (G-Variance Ratio) Test on 


Fourier Magnitude Coefficients of 79 Classes 


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F-RATIO VECTOR. NR SIG PER CLASS = 25 NR CLASSES = 79 
N GLOBAL MEAN F - RATIO RANK F - RATIO 
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TABLE D7. Feature Selector (F-Ratio) Test on Walsh 


Coefficients of 79 Classes 
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